set.seed(15)

library(ggplot2)
library(reshape2)

Description of the sinusoidal model

The observations along the tusk are denoted \(Y_i\) for \(i=1, \ldots, n\), with the corresponding position along the tusk denoted \(x_i\). The model is the following \[Y_i = f(x_i, \varphi)+\varepsilon_i \] with \(\varepsilon_i\) a random noise assumed to be normally distributed with mean \(0\) and variance \(\omega^2\). The regression function is a periodic sinusoidal function \[f(x, \varphi) = A \sin(g(x)+b) + B\sin(2g(x)+2b+\pi/2) \] with function \(g\) defined as \[ g(x) = ax+\xi_x\] and finally \(\xi_x\) is assumed to be a random Ornstein-Uhlenbeck process \[d\xi_x = -\beta \xi_xdx+\sigma dW_x \]

The objective is to estimate the parameters \(\varphi = (A,B,a,b)\); \(\psi = e^{\beta\Delta}\) where \(\Delta\) is the step size between two observations and \(\gamma^2 = \frac{\sigma^2}{2\beta}(1-e^{-2\beta\Delta})\).

Simulation of trajectories

Let us start with some simulations.

source("../../R/2_tusk/utils/simulation.R")
xi <- rxi()
Y <- f(xi) + rnorm(n, 0, omega)
plot(A * sin(g(rep(0, n), a) + b), type = "l")

plot(B * sin(2 * g(rep(0, n), a) + 2 * b + pi / 2), type = "l")

plot(f(rep(0, n)), type = "l")

plot(xi, type = "l")

plot(Y, type = "l")

Estimation with the EM algorithm

The EM algorithm is based on the complete log-likelihood of the model. The solution of the hidden process \(\xi_x\) is \[\xi_{x+\Delta} = \xi_x e^{-\beta\Delta} + \int_{x}^{x+\Delta} \sigma e^{\beta(s-x)}dW_s \] such that the transition density is \[p(\xi_{x+\delta}|\xi_x) = \mathcal{N}(\xi_x e^{-\beta\Delta}, \frac{\sigma^2}{2\beta}(1-e^{-2\beta\Delta}))\]

The complete log-likelihood is thus \[\begin{eqnarray*} \log L(Y,\xi,\theta)&=&\sum_{i=1}^n\log p(Y_i|\xi_i) +\sum_{i=1}^n\log p(\xi_{i}|\xi_{i-1}) +\log p(\xi_1)\\ &=& -\sum_{i=1}^n \frac{(Y_i-f(x_i, \varphi))^2}{2\omega^2}-\frac{n}{2}\log(\omega^2)\\ &&- \sum_{i=1}^n\frac{(\xi_i-\xi_{i-1}e^{-\beta\Delta})^2}{\frac{\sigma^2}{\beta}(1-e^{-2\beta\Delta})} - \frac{n}{2}\log(\frac{\sigma^2}{2\beta}(1-e^{-2\beta\Delta})) \\ &=& -\sum_{i=1}^n \frac{(Y_i-f(x_i, \varphi))^2}{2\omega^2}-\frac{n}{2}\log(\omega^2)\\ &&- \sum_{i=1}^n\frac{(\xi_i-\xi_{i-1}\psi)^2}{2\gamma^2} - \frac{n}{2}\log(\gamma^2) \end{eqnarray*}\]

The EM algorithm proceeds at iteration \(k\) with the two following steps, given the current value of the parameters \(\theta^k\)

E step

The condition distribution \(p(\xi|Y; \theta^k)\) is not explicit because the regression function is not linear. We should proceed with a MCMC algorithm to sample from this distribution. This will lead to a stochastic version of the EM algorithm, namely the SAEM algorithm.

M step

We need the sufficient statistics to update the algorithm.

The statistics are

\[\begin{eqnarray*} S_1(\xi_i)& =& \frac{1}{n}\sum_{i=1}^n (Y_i -f(x_i(\xi_i), \varphi))^2\\ S_2(\xi_i) &=& \sum_{i=1}^n \xi_{i-1}\xi_i\\ S_3(\xi_i) &=& \sum_{i=1}^n \xi_{i-1}^2\\ S_4(\xi_i) &=& \sum_{i=1}^n \xi_i^2 \end{eqnarray*}\]

The update of the parameters are based on these statistics.

SAEM algorithm

The steps of the SAEM algorithm are

  • E step: simulation of a new trajectory \(\xi^k\) with a MCMC algorithm targeting \(p(\xi|Y; \theta^k)\) as stationary distribution

  • SA step: stochastic approximation of the sufficient statistics

\[\begin{eqnarray*} s_1^k &=& s_1^{k-1} + (1-\alpha_k)(S_1(\xi^k)-s_1^{k-1}) \\ s_2^k &=& s_2^{k-1} + (1-\alpha_k)(S_2(\xi^k)-s_2^{k-1}) \\ s_3^k &=& s_3^{k-1} + (1-\alpha_k)(S_3(\xi^k)-s_3^{k-1}) \\ s_4^k &=& s_4^{k-1} + (1-\alpha_k)(S_4(\xi^k)-s_4^{k-1}) \\ \end{eqnarray*}\]

  • M step: update of \(\theta^{k}\) using the sufficient statistics \(s^k\)

\[\begin{eqnarray*} \widehat{\varphi}^k &=& \arg\min_\varphi \sum_{i=1}^n \left(y_i - f(x_i(\xi_i^k), \varphi)\right)^2 \\ \widehat{\psi}^k &=& \frac{s_2^k}{s_3^k}\\ \widehat{\omega^{2}}^k &=& s_1^k\\ \widehat{\gamma^{2}}^k &=& \frac{1}{n}(\widehat{\psi^2}^ks_3^k-2\widehat{\psi}^ks_2^k+s_4^k) \end{eqnarray*}\]

MCMC

source("../../R/2_tusk/utils/MCMC.R")
M <- 150
mcmc.obj <- mcmc.alg(Y, M)

Tirage

plot(xi, type = "l", col = "blue",
     ylim = c(min(min(mcmc.obj$xi.c), min(xi)), max(max(mcmc.obj$xi.c), max(xi))))
lines(mcmc.obj$xi.c, type = "l", col = "red")

plot(Y, type = "l", col = "blue")
lines(f(mcmc.obj$xi.c), type = "l", col = "red")

plot(mcmc.obj$acceptance.rate[mcmc.obj$n.it,])
abline(h = acceptance.target, lty = "dashed", col = "red")

Ecarts

worst.idx <- which.max(abs(xi - mcmc.obj$xi.c))
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
  geom_line(aes(y = mcmc.obj$xi[, worst.idx]), color = "blue") +
  geom_line(aes(y = mcmc.obj$xi.p[, worst.idx]), color = "red") +
  geom_hline(aes(yintercept = xi[[worst.idx]])) +
  ylab("Y")

best.idx <- which.min(abs(xi - mcmc.obj$xi.c))
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
  geom_line(aes(y = mcmc.obj$xi[, best.idx]), color = "blue") +
  geom_line(aes(y = mcmc.obj$xi.p[, best.idx]), color = "red") +
  geom_hline(aes(yintercept = xi[[best.idx]])) +
  ylab("Y")

farest.idx <- which.max(abs(xi))
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
  geom_line(aes(y = mcmc.obj$xi[, farest.idx]), color = "blue") +
  geom_line(aes(y = mcmc.obj$xi.p[, farest.idx]), color = "red") +
  geom_hline(aes(yintercept = xi[[farest.idx]])) +
  ylab("Y")

coeff <- max(mcmc.obj$acceptance.rate[, worst.idx]) / max(mcmc.obj$delta[, worst.idx])
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
  geom_line(aes(y = mcmc.obj$acceptance.rate[, worst.idx]), color = "red") +
  geom_line(aes(y = mcmc.obj$delta[, worst.idx] * coeff), color = "blue") +
  geom_hline(aes(yintercept = acceptance.target * 1.1), linetype = 2, color = "red") +
  geom_hline(aes(yintercept = acceptance.target * .9), linetype = 2, color = "red") +
  scale_y_continuous(name = "Acceptance rate", sec.axis = sec_axis(~. / coeff, name = "Delta"))

coeff <- max(mcmc.obj$acceptance.rate[, best.idx]) / max(mcmc.obj$delta[, best.idx])
ggplot(data = data.frame(x = 1:M), aes(x = x)) +
  geom_line(aes(y = mcmc.obj$acceptance.rate[, best.idx]), color = "red") +
  geom_line(aes(y = mcmc.obj$delta[, best.idx] * coeff), color = "blue") +
  geom_hline(aes(yintercept = acceptance.target * 1.1), linetype = 2, color = "red") +
  geom_hline(aes(yintercept = acceptance.target * .9), linetype = 2, color = "red") +
  scale_y_continuous(name = "Acceptance rate", sec.axis = sec_axis(~. / coeff, name = "Delta"))

xi au fil des k

par(mfrow = c(2, 2))
for (k.ind in round(seq(1, M, length.out = 10))) {
  plot(xi, type = "l", col = "blue",
       ylim = c(min(min(mcmc.obj$xi[k.ind,]), min(xi)), max(max(mcmc.obj$xi[k.ind,]), max(xi))),
       main = paste("k=", k.ind))
  lines(mcmc.obj$xi[k.ind,], type = "l", col = "red")
}

f au fil des k

par(mfrow = c(2, 2))
for (k.ind in round(seq(1, M, length.out = 10))) {
  plot(Y, type = "l", col = "blue",
       main = paste("k=", k.ind))
  lines(f(mcmc.obj$xi[k.ind,]), type = "l", col = "red")
}

Identifiabilité

p1 <- rxi()
p2 <- rxi()
p3 <- rxi()
p4 <- rxi()

par(mfrow = c(2, 2))

plot(p1, type = "l", col = 1)

plot(p1, type = "l", col = 1, ylim = c(min(c(min(p1), min(p2))), max(c(max(p1), max(p2)))))
lines(p2, type = "l", col = 2)

plot(p1, type = "l", col = 1, ylim = c(min(c(min(p1), min(p3))), max(c(max(p1), max(p3)))))
lines(p3, type = "l", col = 3)

plot(p1, type = "l", col = 1, ylim = c(min(c(min(p1), min(p4))), max(c(max(p1), max(p4)))))
lines(p4, type = "l", col = 4)

par(mfrow = c(2, 2))

plot(f(p1), type = "l", col = 1)

plot(f(p1), type = "l", col = 1)
lines(f(p2), type = "l", col = 2)

plot(f(p1), type = "l", col = 1)
lines(f(p3), type = "l", col = 3)

plot(f(p1), type = "l", col = 1)
lines(f(p4), type = "l", col = 4)

par(mfrow = c(2, 2))
plot(Y, type = "l", col = 1, main = "goal")

plot(Y, type = "l", col = 1, main = "MCMC")
lines(f(mcmc.obj$xi.c), type = "l", col = 2)

plot(Y, type = "l", col = 1, main = "xi simulation")
lines(f(mcmc.obj$xi.c), type = "l", col = 2)
lines(f(p1), type = "l", col = 3)

plot(Y, type = "l", col = 1, main = "xi simulation")
lines(f(mcmc.obj$xi.c), type = "l", col = 2)
lines(f(p2), type = "l", col = 4)

EM

source("../../R/2_tusk/utils/SAEM.R")
saem.obj <- saem.alg(Y)
plot(xi, type = "l", col = "blue",
     ylim = c(min(min(saem.obj$mcmc$xi.c), min(xi)), max(max(saem.obj$mcmc$xi.c), max(xi))))
lines(saem.obj$mcmc$xi.c, type = "l", col = "red")

plot(Y, type = "l", col = "blue")
lines(f(saem.obj$mcmc$xi.c, A.arg = saem.obj$A.c, B.arg = saem.obj$B.c, a.arg = saem.obj$a.c, b.arg = saem.obj$b.c),
      type = "l", col = "red")

plot(f(rep(0, n)), type = "l")
lines(f(rep(0, n), A.arg = saem.obj$A.c, B.arg = saem.obj$B.c, a.arg = saem.obj$a.c, b.arg = saem.obj$b.c),
      type = "l", col = "red")

par(mfrow = c(2, 2))

plot(saem.obj$omega, type = 'l', col = 'blue', main = "Omega", ylim = c(min(c(saem.obj$omega, omega)), max(saem.obj$omega, omega)))
abline(h = omega, col = 'red')

plot(saem.obj$psi, type = 'l', col = 'blue', main = "Psi", ylim = c(min(c(saem.obj$psi, psi)), max(saem.obj$psi, psi)))
abline(h = psi, col = 'red')

plot(saem.obj$gamma, type = 'l', col = 'blue', main = "gamma", ylim = c(min(c(saem.obj$gamma, gamma)), max(saem.obj$gamma, gamma)))
abline(h = gamma, col = 'red')

plot(saem.obj$A, type = 'l', col = 'blue', main = "A", ylim = c(min(c(saem.obj$A, A)), max(saem.obj$A, A)))
abline(h = A, col = 'red')

plot(saem.obj$B, type = 'l', col = 'blue', main = "B", ylim = c(min(c(saem.obj$B, B)), max(saem.obj$B, B)))
abline(h = B, col = 'red')

plot(saem.obj$b, type = 'l', col = 'blue', main = "b", ylim = c(min(c(saem.obj$b, b)), max(saem.obj$b, b)))
abline(h = b, col = 'red')

plot(saem.obj$a, type = 'l', col = 'blue', main = "a", ylim = c(min(c(saem.obj$a, a)), max(saem.obj$a, a)))
abline(h = a, col = 'red')

Plan d’expérience

params.true <- c(omega = omega, psi = psi, gamma = gamma, A = A, B = B, a = a, b = b)
params.est <- readRDS("../../data/2_tusk/parameters.est.rds")
apply(params.est, 2, mean)
##       omega         psi       gamma           A           B           a 
##  0.01784953  0.93182377  0.09719714  0.49804310 -0.24480915  0.10000059 
##           b 
##  1.01001660
params.pe <- t(apply(params.est, 1, function (row) (params.true - row) / params.true) * 100)
params.mape <- apply(abs(params.pe), 2, mean)
params.se <- apply(params.est, 1, function (row) (params.true - row)^2)
params.rmse <- sqrt(apply(params.se, 1, mean))
df <- melt(params.pe, varnames = c("it", "param"))
ggplot(data = df) +
  geom_boxplot(aes(x = param, y = value))

params.mape
##      omega        psi      gamma          A          B          a          b 
## 78.4953366  2.1691948  6.0136665  0.6686644  2.1343387  0.4716401 14.3315989
params.rmse
##        omega          psi        gamma            A            B            a 
## 0.0084336743 0.0265938232 0.0078524878 0.0318171392 0.0062878441 0.0005930824 
##            b 
## 0.1969258133